Neural Embedding Allocation: Distributed Representations of Topic Models

Authors

Abstract

We propose a method that uses neural embeddings to improve the performance of any given LDA-style topic model. Our method, called neural embedding allocation (NEA), deconstructs topic models (LDA or otherwise) into interpretable vector-space representations of words, topics, documents, authors, and so on, by learning embeddings whose reconstructed distributions mimic the original model. We demonstrate that NEA improves coherence scores of the original model by smoothing out noisy topics when the number of topics is large. Furthermore, we show NEA's effectiveness and generality in deconstructing LDA, author-topic models, and the recent mixed membership skip-gram model, and achieve better performance with the learned embeddings compared to several state-of-the-art models.
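
The mimicking objective can be pictured as fitting topic and word embeddings so that a softmax over their inner products reproduces the original model's topic-word distributions. The sketch below is an illustrative reconstruction of that idea, not the authors' implementation; the plain-NumPy gradient descent, the KL objective, and all sizes and hyperparameters (K, V, D, lr, n_steps) are assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
K, V, D = 5, 100, 16          # topics, vocabulary size, embedding dim (assumed)

# Stand-in for a fitted LDA topic-word matrix phi: each row is a
# distribution over the vocabulary.
phi = rng.dirichlet(np.full(V, 0.1), size=K)          # shape (K, V)

T = 0.01 * rng.standard_normal((K, D))                # topic embeddings
W = 0.01 * rng.standard_normal((V, D))                # word embeddings

lr, n_steps = 0.5, 2000
for step in range(n_steps):
    logits = T @ W.T                                  # (K, V) inner products
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)                 # row-wise softmax

    # Gradient of sum_k KL(phi_k || p_k) w.r.t. the logits is (p - phi).
    g = (p - phi) / K
    grad_T, grad_W = g @ W, g.T @ T
    T -= lr * grad_T
    W -= lr * grad_W

    if step % 500 == 0:
        kl = np.sum(phi * (np.log(phi + 1e-12) - np.log(p + 1e-12)))
        print(f"step {step:4d}  KL(phi || p) = {kl:.4f}")
```

As the KL divergence shrinks, the embeddings encode the topic model's structure in a vector space where topics and words can be compared by inner products.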

Similar Articles

Concurrent Inference of Topic Models and Distributed Vector Representations

Topic modeling techniques have been widely used to uncover dominant themes hidden inside an unstructured document collection. Though these techniques first originated in the probabilistic analysis of word distributions, many deep learning approaches have been adopted recently. In this paper, we propose a novel neural-network-based architecture that produces distributed representations of topics ...

Discriminative Neural Topic Models

We propose a neural network based approach for learning topics from text and image datasets. The model makes no assumptions about the conditional distribution of the observed features given the latent topics. This allows us to perform topic modelling efficiently using sentences of documents and patches of images as observed features, rather than limiting ourselves to words. Moreover, the propos...

Incorporating Topic Priors into Distributed Word Representations

Representing words as continuous vectors enables the quantification of semantic relationships between words through vector operations, and has therefore attracted much attention recently. This paper proposes an approach to combining continuous word representations and topic modeling by encoding words based on their topic distributions in the hierarchical softmax, so as to introduce the prior semantic relevance i...
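
One way to picture "encoding words based on their topic distributions in the hierarchical softmax" is a two-level tree whose first level routes each word through its dominant topic, so that topically related words share an internal node. The sketch below only illustrates that tree construction under assumed inputs (a fitted word-topic matrix `word_topic`); it is not the paper's method, and all names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
V, K = 1000, 20               # vocabulary size and topic count (assumed)

# Stand-in for word-topic distributions from a fitted topic model:
# word_topic[v] is a distribution over K topics for word v.
word_topic = rng.dirichlet(np.full(K, 0.5), size=V)   # shape (V, K)

# Level 1 of the softmax tree: each word hangs under its dominant topic,
# so P(word | context) = P(topic | context) * P(word | topic, context).
dominant = word_topic.argmax(axis=1)                  # shape (V,)
clusters = {k: np.flatnonzero(dominant == k) for k in range(K)}

# Each prediction now scores ~K topic nodes plus the words in one cluster
# (~V/K on average) instead of all V words in a flat softmax.
for k in range(3):
    print(f"topic {k}: {clusters[k].size} words share an internal node")
```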

Distributed Algorithms for Topic Models

We describe distributed algorithms for two widely used topic models, namely the Latent Dirichlet Allocation (LDA) model and the Hierarchical Dirichlet Process (HDP) model. In our distributed algorithms the data is partitioned across separate processors and inference is done in a parallel, distributed fashion. We propose two distributed algorithms for LDA. The first algorithm is a straightforwar...
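
As a rough illustration of the data-parallel pattern such algorithms follow (a sketch under assumed names, not the paper's code): each worker sweeps its own document shard against a stale copy of the global topic-word counts, and the workers' count changes are then merged back into the global state at a synchronization step.

```python
import numpy as np

P, K, V = 4, 10, 500          # workers, topics, vocabulary size (assumed)
rng = np.random.default_rng(2)
global_counts = np.zeros((K, V), dtype=np.int64)      # global topic-word counts

def local_sweep(counts, rng):
    """Stand-in for one Gibbs sweep over a worker's document shard.
    Here we only record 100 new token-topic assignments; a real sweep
    would also decrement the counts of each token's previous topic."""
    delta = np.zeros_like(counts)
    v, k = rng.integers(V, size=100), rng.integers(K, size=100)
    np.add.at(delta, (k, v), 1)
    return delta

for it in range(5):                                   # outer iterations
    # Each worker samples against its own (stale) copy of the counts;
    # in the real algorithm these calls run on separate processors.
    deltas = [local_sweep(global_counts.copy(), rng) for _ in range(P)]
    # Synchronization: merge every worker's changes into the global state.
    global_counts += np.sum(deltas, axis=0)

print("total assignments recorded:", int(global_counts.sum()))
```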

Journal

Title: Computational Linguistics

Year: 2022

ISSN: 1530-9312, 0891-2017

DOI: https://doi.org/10.1162/coli_a_00457